Feature Weighting Strategies in Sentiment Analysis

Authors

  • Olena Kummer
  • Jacques Savoy
Abstract

In this paper, we propose an adaptation of the Kullback-Leibler divergence score for the task of sentiment and opinion classification at the sentence level. We propose using the resulting score with an SVM model, applying different thresholds to prune the feature set. We argue that pruning the feature set for sentiment analysis (SA) may be detrimental to a classifier's performance on short text. As an alternative, we consider a simple additive scheme that takes all of the features into account. Accuracy rates over 10-fold cross-validation indicate that the latter approach outperforms the SVM classification scheme.
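The abstract does not give the exact form of the adapted score, so the following is only a minimal sketch of the general idea, under stated assumptions: each term receives a per-class pointwise Kullback-Leibler contribution, p * log(p / q), where p is its smoothed probability in the class and q its smoothed probability in the remaining classes, and a sentence is assigned to the class with the largest sum over all of its terms (no pruning). The function names `kl_scores` and `classify`, the smoothing parameter `alpha`, and the scoring formula itself are illustrative assumptions, not the authors' definitions.

```python
import math
from collections import Counter


def kl_scores(docs, labels, alpha=1.0):
    """Per-class pointwise KL-style score for every term.

    Assumed form (not taken from the paper): p * log(p / q), with
    p = P(term | class) and q = P(term | other classes), both
    estimated with additive (Laplace) smoothing.
    """
    classes = sorted(set(labels))
    counts = {c: Counter() for c in classes}
    for tokens, label in zip(docs, labels):
        counts[label].update(tokens)

    vocab = set().union(*counts.values())
    totals = {c: sum(counts[c].values()) for c in classes}

    scores = {c: {} for c in classes}
    for c in classes:
        rest_total = sum(totals[o] for o in classes if o != c)
        for term in vocab:
            p = (counts[c][term] + alpha) / (totals[c] + alpha * len(vocab))
            rest = sum(counts[o][term] for o in classes if o != c)
            q = (rest + alpha) / (rest_total + alpha * len(vocab))
            scores[c][term] = p * math.log(p / q)
    return scores


def classify(tokens, scores):
    """Additive scheme: sum the scores of all terms, pick the best class."""
    sums = {c: sum(s.get(t, 0.0) for t in tokens) for c, s in scores.items()}
    return max(sums, key=sums.get)


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    train = [["good", "great"], ["bad", "awful"], ["good", "nice"], ["bad", "poor"]]
    y = ["pos", "neg", "pos", "neg"]
    model = kl_scores(train, y)
    print(classify(["good", "nice"], model))
```

Because every term contributes to the sum, a short sentence is not left without features, which is the risk the abstract attributes to threshold-based pruning.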

Similar resources

A Systematic Literature Review of Sentiment Analysis Techniques

Development of Web 2.0 has resulted in enormous increase in the vast source of opinionated user generated data. Sentiment Analysis includes extracting, grasping, arranging and presenting the feelings or suppositions communicated in the information gathered from the clients. This paper exhibits an efficient writing survey of different strategies of sentiment analysis. A model for sentiment analy...

A Study of Information Retrieval Weighting Schemes for Sentiment Analysis

Most sentiment analysis approaches use as baseline a support vector machines (SVM) classifier with binary unigram weights. In this paper, we explore whether more sophisticated feature weighting schemes from Information Retrieval can enhance classification accuracy. We show that variants of the classic tf.idf scheme adapted to sentiment analysis provide significant increases in accuracy, especia...

Exploring Feature Definition and Selection for Sentiment Classifiers

In this paper, we systematically explore feature definition and selection strategies for sentiment polarity classification. We begin by exploring basic questions, such as whether to use stemming, term frequency versus binary weighting, negation-enriched features, n-grams or phrases. We then move onto more complex aspects including feature selection using frequency-based vocabulary trimming, par...

RTRGO: Enhancing the GU-MLT-LT System for Sentiment Analysis of Short Messages

This paper describes the enhancements made to our GU-MLT-LT system (Günther and Furrer, 2013) for the SemEval-2014 re-run of the SemEval-2013 shared task on sentiment analysis in Twitter. The changes include the usage of a Twitter-specific tokenizer, additional features and sentiment lexica, feature weighting and random subspace learning. The improvements result in an increase of 4.18 F-measure...

Feature-based Sentiment Analysis Approach for Product Reviews

The researches and applications of sentiment analysis become increasingly important with the rapid growth of online reviews. But traditional sentiment analysis models have been lacking in concern on the modifying relationship between words for sentiment analysis of Chinese reviews, and limit the development of opinion mining. This paper proposes a feature-based vector model and a novel weightin...

Publication date: 2012